We study the adaptation of Link Grammar Parser to the biomedical sublanguage with a focus on domain terms not found in a general parser lexicon. Using two biomedical corpora, we implement and evaluate three approaches to addressing unknown words: automatic lexicon expansion, the use of morphological clues, and disambiguation using a part-of-speech tagger. We evaluate each approach separately for its effect on parsing performance and consider combinations of these approaches. In addition to a 45% increase in parsing efficiency, we find that the best approach, incorporating information from a domain part-of-speech tagger, offers a statistically significant 10% relative decrease in error. The adapted parser is available under an open-source license at http://www.it.utu.fi/biolg.